Improving Pronoun Translation for Statistical Machine Translation
Abstract
Machine Translation is a well-established field, yet the majority of current systems translate sentences in isolation, losing valuable contextual information from previously translated sentences in the discourse. One important type of contextual information concerns who or what a coreferring pronoun corefers to (i.e., its antecedent). Languages differ significantly in how they achieve coreference, and awareness of antecedents is important in choosing the correct pronoun. Disregarding a pronoun’s antecedent in translation can lead to inappropriate coreferring forms in the target text, seriously degrading a reader’s ability to understand it. This work assesses the extent to which source-language annotation of coreferring pronouns can improve English–Czech Statistical Machine Translation (SMT). As with previous attempts that use this method, the results show little improvement. This paper attempts to explain why and to provide insight into the factors affecting performance.
Similar resources
Improving Pronoun Translation for Statistical Machine Translation (SMT)
Machine Translation is a well established field, yet the majority of current systems perform the translation of sentences in complete isolation, losing valuable contextual information from previously translated sentences in the discourse. One such class of contextual information concerns who or what it is that a reduced referring expression such as a pronoun is meant to refer to. The use of ina...
Improving Pronoun Translation by Modeling Coreference Uncertainty
Information about the antecedents of pronouns is considered essential to solve certain translation divergencies, such as those concerning the English pronoun it when translated into gendered languages, e.g. for French into il, elle, or several other options. However, no machine translation system using anaphora resolution has so far been able to outperform a phrase-based statistical MT baseline...
Zero Pronoun Resolution can Improve the Quality of J-E Translation
In Japanese, particularly spoken Japanese, subjective, objective and possessive cases are very often omitted. Japanese-English statistical machine translation often renders such sentences as English sentences in which the subjective, objective and possessive cases are also omitted, which decreases the quality of translation. We performed experiments of J-E phrase based tran...
Translating Pronouns with Latent Anaphora Resolution
We discuss the translation of anaphoric pronouns in statistical machine translation from English into French. Pronoun translation requires resolving the antecedents of the pronouns in the input, a classic discourse processing problem that is usually approached through supervised learning from manually annotated data. We cast cross-lingual pronoun prediction as a classification task and present ...
ParCor 1.0: A Parallel Pronoun-Coreference Corpus to Support Statistical MT
We present ParCor, a parallel corpus of texts in which pronoun coreference – reduced coreference in which pronouns are used as referring expressions – has been annotated. The corpus is intended to be used both as a resource from which to learn systematic differences in pronoun use between languages and ultimately for developing and testing informed Statistical Machine Translation systems aimed ...
Publication date: 2012